Estimating inflation in GWAS summary statistics due to variance distortion from cryptic relatedness

Authors

  • Dominic Holland
  • Chun-Chieh Fan
  • Oleksandr Frei
  • Alexey A. Shadrin
  • Olav B. Smeland
  • V. S. Sundar
  • Ole A. Andreassen
  • Anders M. Dale
Abstract

Cryptic relatedness is inherently a feature of large genome-wide association studies (GWAS), and can give rise to considerable inflation in summary statistics for single nucleotide polymorphism (SNP) associations with phenotypes. It has proven difficult to disentangle these inflationary effects from true polygenic effects. Here we present results of a model that enables estimation of polygenicity, mean strength of association, and residual inflation in GWAS summary statistics. We show that there is substantial residual inflation in recent large GWAS of height and schizophrenia; correcting for this reduces the number of independent genome-wide significant loci from the reported values of 697 for height and 108 for schizophrenia to 368 and 61, respectively. In contrast, a larger GWAS of educational attainment shows no residual inflation. Additionally, we find that height has a relatively low polygenicity, with approximately 8k SNPs having causal association, more than an order of magnitude less than has been reported. The residual inflation in GWAS summary statistics can be corrected using the standard genomic control procedure with the estimated residual inflation factor.
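
The genomic control correction mentioned at the end of the abstract can be sketched as follows. In the standard procedure each chi-square association statistic (z squared) is divided by an inflation factor, which is equivalent to dividing each z-score by the square root of that factor. This is a minimal illustration rather than the authors' code: the inflation factor of 1.2, the z-scores, and the function names are hypothetical, and the 5e-8 cutoff is the conventional genome-wide significance threshold, which the abstract itself does not state. The sketch counts individual SNPs, not independent loci.

```python
import math

# Conventional genome-wide significance threshold (assumed; not given in the abstract).
GENOME_WIDE_ALPHA = 5e-8


def genomic_control(z_scores, inflation_factor):
    """Deflate z-scores by a genomic control inflation factor.

    Dividing z by sqrt(lambda) is the same as dividing z**2 by lambda.
    """
    if inflation_factor <= 0:
        raise ValueError("inflation_factor must be positive")
    scale = math.sqrt(inflation_factor)
    return [z / scale for z in z_scores]


def two_sided_p(z):
    """Two-sided p-value of a z-score under the standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def count_significant(z_scores, alpha=GENOME_WIDE_ALPHA):
    """Number of z-scores whose two-sided p-value falls below alpha."""
    return sum(two_sided_p(z) < alpha for z in z_scores)


# Hypothetical inputs, for illustration only.
z_raw = [5.6, -5.3, 6.5, 1.2]
residual_inflation = 1.2  # hypothetical estimate of the residual inflation factor

z_adj = genomic_control(z_raw, residual_inflation)
hits_raw = count_significant(z_raw)
hits_adj = count_significant(z_adj)
```

With these made-up numbers, one SNP that passes the threshold before correction no longer passes after it, which is the mechanism by which the corrected counts of significant loci reported in the abstract become smaller.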

Similar resources

Estimating phenotypic polygenicity and causal effect size variance from GWAS summary statistics while accounting for inflation due to cryptic relatedness

Of particular interest in the genetics of traits is estimating the proportion, π1, of causally associated single nucleotide polymorphisms (SNPs) and their effect size variance, σ_β^2, which are components of the mean heritabilities captured by the causal SNP. Here we present the first model, using detailed linkage disequilibrium structure, to estimate these quantities from genome-wide association ...

Population Structure in Genetic Association Studies

Standard genetic association tests using case-control data are based on certain assumptions about the population from which study subjects were sampled. Two types of departure from these assumptions have been studied: population stratification and cryptic relatedness. Both types of departure have been called population structure. Each can lead to erroneous inferences due to differences between ...

Cryptic relatedness in epidemiologic collections accessed for genetic association studies: experiences from the Epidemiologic Architecture for Genes Linked to Environment (EAGLE) study and the National Health and Nutrition Examination Surveys (NHANES)

Epidemiologic collections have been a major resource for genotype-phenotype studies of complex disease given their large sample size, racial/ethnic diversity, and breadth and depth of phenotypes, traits, and exposures. A major disadvantage of these collections is they often survey households and communities without collecting extensive pedigree data. Failure to account for substantial relatedne...

Estimating the Total Number of Susceptibility Variants Underlying Complex Diseases from Genome-Wide Association Studies

Recently, genome-wide association studies (GWAS) have identified numerous susceptibility variants for complex diseases. In this study we proposed several approaches to estimate the total number of variants underlying these diseases. We assume that the variance explained by genetic markers (Vg) follows an exponential distribution, which is justified by previous studies on theories of adaptation. O...

Estimating Effect Sizes and Expected Replication Probabilities from GWAS Summary Statistics

Genome-wide Association Studies (GWAS) result in millions of summary statistics ("z-scores") for single nucleotide polymorphism (SNP) associations with phenotypes. These rich datasets afford deep insights into the nature and extent of genetic contributions to complex phenotypes such as psychiatric disorders, which are understood to have substantial genetic components that arise from very large ...

Publication date: 2017